Organization of modular networks 
We examine the global organization of heterogeneous equilibrium networks consisting of a number 
of well distinguished interconnected parts — "communities" or modules. We develop an analytical 
approach allowing us to obtain the statistics of connected components and an intervertex distance 
distribution in these modular networks, and to describe their global organization and structure. 
In particular, we study the evolution of the intervertex distance distribution with an increasing 
number of interlinks connecting two infinitely large uncorrelated networks. We demonstrate that 
even a relatively small number of shortcuts unite the networks into one. In more precise terms, 
if the number of the interlinks is any finite fraction of the total number of connections, then the 
intervertex distance distribution approaches a delta-function peaked form, and so the network is 
united. 

PACS numbers: 02.10.Ox, 89.20.Hh, 89.75.Fb 
I. INTRODUCTION 



Many real-world networks contain principally distinct parts with different
architectures. In this sense, they are strongly heterogeneous. For example,
the Internet (the net of physically interconnected computers) is connected
to mobile cellular networks. One should note that the issue of network
heterogeneity is among the key problems in the statistical mechanics of
complex networks. The question is: how does a network's inhomogeneity
influence its global structure? The quantitative description of the global
organization of a network is essentially based on the statistics of the
n-th components of a vertex in the network, particularly on the statistics
of their sizes. The n-th component of a vertex is defined as the set of
vertices which are not farther than distance n from a given vertex. From
these statistics, one can easily find less informative but very useful
characteristics: the distribution of intervertex distances and its first
moment, the average intervertex distance. In networks with the small-world
phenomenon, so-called "small worlds", the mean length of the shortest path
$\bar\ell(N)$ between two vertices grows slower than any positive power of
the network size $N$ (the total number of vertices). Rather typically,
$\bar\ell(N) \sim \ln N$. As a rule, in infinite small worlds, the
distribution of intervertex distances approaches a delta-function form,
where the mean width $\delta\ell$ is much smaller than $\bar\ell$. Moreover,
in uncorrelated networks, $\delta\ell(N\to\infty) \to \mathrm{const}$. So,
in simple terms, vertices in these infinite networks are almost surely
mutually equidistant. This statement can be easily understood if a network
has no weakly connected separate parts [11].
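
The two quantities just defined can be measured directly on any finite graph. The following is a minimal sketch, assuming an adjacency-list dictionary as the graph representation and the convention of the Introduction (the n-th component contains every vertex at distance n or less, including the vertex itself). The function names are illustrative and do not come from the paper.

```python
from collections import deque

def component_sizes(adj, source):
    """Sizes of the n-th components of `source` for n = 0, 1, 2, ...
    (number of vertices at distance <= n, the source included)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:                      # plain breadth-first search
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    shell = [0] * (max(dist.values()) + 1)
    for d in dist.values():
        shell[d] += 1
    sizes, total = [], 0
    for count in shell:               # cumulative sum over distance shells
        total += count
        sizes.append(total)
    return sizes

def distance_distribution(adj):
    """Normalized histogram of shortest-path lengths over all ordered
    pairs of distinct, mutually reachable vertices."""
    hist = {}
    for s in adj:
        sizes = component_sizes(adj, s)
        for n in range(1, len(sizes)):
            hist[n] = hist.get(n, 0) + sizes[n] - sizes[n - 1]
    total = sum(hist.values())
    return {n: c / total for n, c in sorted(hist.items())}

# Toy example: a path graph 0-1-2-3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
sizes_from_0 = component_sizes(path, 0)
dist_hist = distance_distribution(path)
```

In a small world the histogram returned by `distance_distribution` narrows around its mean as the graph grows, which is the equidistance property discussed above.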
In this paper we consider a contrasting situation. Our networks are divided
into a number of non-overlapping but interlinked subnetworks, say
j = 1, 2, ..., m. What is important, we suppose that the connections between
these subnetworks are organized differently than inside them, see Fig. 1.
This assumption results in a global (or, one may say, macroscopic)
heterogeneity of a network. Using the popular term "community", one can say
that our networks have well distinguished communities or modules. Modular
architectures of this kind lead to a variety of effects. Figure 1 explains
the difference between these modular networks and the well studied
m-partite networks.

In this work we analytically describe the statistics of 
the n-th components in these networks when all m com- 
munities are uncorrelated. For the sake of brevity, here 
we consider only the case of m = 2, i.e., of two net- 
works with shortcuts between them. As an immediate 







FIG. 1: (a) An example of a network which we study in this paper, in the
case m = 3. The structure of interconnections between the three
non-overlapping subnetworks differs from the structure of connections inside
these subnetworks. Moreover, the structures of the three subnetworks may
differ. (b) A contrasting example of a 3-partite graph, where connections
between vertices of the same kind are absent.
FIG. 2: Schematic view of the evolution of an intervertex distance
distribution with the growing number of shortcuts between two large
networks: (a) two separate networks; (b) two networks with a single shortcut
between them; (c) two interconnected networks, when the number of shortcuts
is a finite fraction of the number of edges in the networks. $\bar\ell_1$
and $\bar\ell_2$ are the average intervertex distances in the first and in
the second networks, respectively.



application of this theory we find a distribution of intervertex distances.
We show how the global architecture of this (large) network evolves with an
increasing number of shortcuts, when two networks merge into one. The
question is: when is the mutual equidistance property realized? How general
is this feature? Figure 2 schematically presents our result. The conclusion
is that the equidistance is realized when the number of shortcuts is a
finite fraction of the total number of edges in the network. This finite
fraction may be arbitrarily small, though bigger than 0. In this respect,
the large network becomes united at arbitrarily small concentrations of
shortcuts.

In Sec. II we briefly present our results. Section III describes our general
approach to these networks based on the Z-transformation (generating
function) technique. In Sec. IV we explain how to obtain the intervertex
distance distribution for infinitely large networks. In Sec. V we discuss
our results. Finally, for the sake of clarity, in the Appendix we outline
the Z-transformation approach in application to the configuration model of
uncorrelated networks.



II. MAIN RESULTS 

We apply the theory of Sec. III to the following problem. Two large
uncorrelated networks, of $N_1$ and $N_2$ vertices, have degree
distributions $\Pi_1(q)$ and $\Pi_2(q)$ with converging second moments. We
assume that dead ends are absent, i.e., $\Pi_1(1) = \Pi_2(1) = 0$, which
guarantees that finite connected components are not essential in the
infinite network limit (see Ref. [20]). $\mathcal{L}$ edges interconnect
randomly chosen vertices of net 1 and randomly chosen vertices of net 2. For
simplicity, we assume in this problem that $\mathcal{L}$ is much smaller
than the total number of connections in the network. The question is: what
is the form of the intervertex distance distribution? In the infinite
network limit, to describe this distribution, it is sufficient to know three
numbers: an average distance $\bar\ell_1$ between vertices of subnetwork 1,
an average distance $\bar\ell_2$ between vertices of subnetwork 2, and an
average distance $\bar d$ between a vertex from subnetwork 1 and a vertex
from subnetwork 2. These three numbers give the positions of the three peaks
in the distribution. Examining the variations of these three distances with
$\mathcal{L}$, one can find when the equidistance property takes place.
We introduce the following quantities:



$$K_{1,2} = \sum_{q,r} q(q-1)\,\Pi_{1,2}(q,r)\,/\,\bar q_{1,2}. \tag{1}$$



Here $\Pi_{1,2}(q,r)$ are given distributions of vertices of intra-degree
$q$ and inter-degree $r$ in subnetworks 1, 2 (see Sec. III for more detail),
and $\bar q_1$ and $\bar q_2$ are mean intra-degrees of vertices in
subnetworks 1 and 2, respectively. In terms of $K_1$ and $K_2$, the
generalizations of the mean branching [$\zeta$ in the standard configuration
model, see Eq. (A.13) in the Appendix] are



Ci = iu 



N!N 2 qiq2 {Kt - K 2 )K( 



b = K 2 - 



(#1 - K 2 )Kl 



2 ■ 



(2) 



Here we assume that $K_1 > K_2$ and $\bar d > \bar\ell_2$. We also suppose
that $\zeta_1, \zeta_2 < \infty$. If the resulting average distance
$\bar\ell_1$ between vertices of subnetwork 1 is smaller than the
corresponding average distance $\bar\ell_2$ for subnetwork 2, then we obtain
asymptotically



InAi 
ln~Ci~ 



In < N2 



I11C2 
ln(2 



C! + cc 



(3) 
(4) 
(5) 



Here the constant $C$ is determined by the degree distributions $\Pi_1(q)$
and $\Pi_2(q)$ and is independent of $N_1$, $N_2$, and $\mathcal{L}$. In
formula (4), $C' = (N_1N_2/\mathcal{L})^{\ln\zeta_2/\ln\zeta_1}$. Note that
these asymptotic estimates ignore constant additives. Formulas (3)-(5)
demonstrate that if $\mathcal{L}$ is a finite fraction of the total number
of connections (in the infinite network limit), then $\bar\ell_2$ and
$\bar d$ approach $\bar\ell_1$. The differences are only finite numbers.
Indeed, the second terms in relations (4) and (5) are finite numbers if
$N_2/\mathcal{L} \to \mathrm{const}$. [When $\mathcal{L}$ is a finite
fraction of the total number of connections, $C' \ll C\mathcal{L}$ in
Eq. (4).] On the other hand, when $\mathcal{L}$ is formally set to 1,
relation (5) gives $\bar d = \bar\ell_1 + \bar\ell_2$, see Fig. 2(b).
Furthermore, assuming $\mathcal{L}/N_{1,2} \to 0$ as $N_{1,2} \to \infty$,
we have $\bar\ell_2 \cong \ln N_2/\ln\zeta_2$ according to relation (4),
since $C' = (N_1N_2/\mathcal{L})^{\ln\zeta_2/\ln\zeta_1} \gg C\mathcal{L}$.

We also consider a special situation where subnetworks 1 and 2 are equal, so
$K_1 = K_2 = K$, $\bar q_1 = \bar q_2$, and $N_1 = N_2 = N$. In this case
the mean branching coefficients are



C1.2 =K± 



C 1 

Nq ± K' 



(6) 



With these $\zeta_1$ and $\zeta_2$, the mean intervertex distances have the
following asymptotics: $\bar\ell_1 = \bar\ell_2 \cong \ln N/\ln\zeta_1$ and
$\bar d \cong \bar\ell_1 + \ln(N/\mathcal{L})/\ln\zeta_2$. Formally setting
$\mathcal{L}$ to 1, we arrive at $\bar d = 2\bar\ell_1 = 2\bar\ell_2$. One
should stress that all the listed results indicate a smooth crossover from
two separate networks to a single united one: there is no sharp transition
between these two regimes.



III. STATISTICS OF MODULAR NETWORKS 

We consider two interlinked undirected networks, one of $N_1$, the other of
$N_2$ vertices. The adjacency matrix of the joint network, $\hat g$, has the
following structure:

$$\hat g = \begin{pmatrix} \hat g_1 & \hat h \\ \hat h^T & \hat g_2 \end{pmatrix}.$$

Here $\hat g_1 = \hat g_1^T$ and $\hat g_2 = \hat g_2^T$ are $N_1 \times N_1$
and $N_2 \times N_2$ adjacency matrices of the first and of the second
subnetworks, respectively, and $\hat h$ is an $N_1 \times N_2$ matrix for
interconnections. We use the following notations: Latin (Greek) subscripts
$i$, $j$, etc. ($\alpha$, $\beta$, etc.) take values $1, 2, \ldots, N_1$
($N_1+1, N_1+2, \ldots, N_1+N_2$). So $g_{ji}$, $g_{\beta\alpha}$,
$g_{j\alpha}$ and $g_{\beta i}$ are the matrix elements of $\hat g_1$,
$\hat g_2$, $\hat h$ and $\hat h^T$, respectively. We assume the whole
network to be a simple one, i.e., the matrix elements of $\hat g$ are either
0 or 1, and the diagonal ones are all zero, $g_{ii} = g_{\alpha\alpha} = 0$.

Every vertex in this network has an intra-degree and an inter-degree. Vertex
$i$ belonging to subnetwork 1 has intra-degree $q_i = \sum_j g_{ji}$ and
inter-degree $r_i = \sum_\alpha g_{\alpha i}$. Vertex $\alpha$ belonging to
subnetwork 2 has intra-degree $q_\alpha = \sum_\beta g_{\beta\alpha}$ and
inter-degree $r_\alpha = \sum_j g_{j\alpha}$. The total numbers of intra-
and interlinks are $2L_1 = \sum_{i,j} g_{ij}$,
$2L_2 = \sum_{\beta,\alpha} g_{\beta\alpha}$ and
$\mathcal{L} = \sum_{j,\alpha} g_{j\alpha} = \sum_{\beta,i} g_{\beta i}$.
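
As an illustration of these definitions, the intra- and inter-degrees and the link counts can be read off the three blocks of the adjacency matrix. The sketch below assumes the blocks are stored as nested Python lists of 0/1 entries, with `h[i][a]` playing the role of $g_{i\alpha}$. The function name and the toy matrices are hypothetical.

```python
def degrees(g1, g2, h):
    """Intra-/inter-degrees and link counts from the blocks of the
    joint adjacency matrix: g1 (N1 x N1), g2 (N2 x N2), h (N1 x N2)."""
    n1, n2 = len(g1), len(g2)
    q1 = [sum(row) for row in g1]                 # intra-degrees, net 1
    q2 = [sum(row) for row in g2]                 # intra-degrees, net 2
    r1 = [sum(h[i]) for i in range(n1)]           # inter-degrees, net 1
    r2 = [sum(h[i][a] for i in range(n1)) for a in range(n2)]
    links1 = sum(q1) // 2                         # L_1: every intralink counted twice
    links2 = sum(q2) // 2                         # L_2
    interlinks = sum(r1)                          # script-L
    if interlinks != sum(r2):
        raise ValueError("h is inconsistent: interlink counts differ")
    return q1, r1, q2, r2, links1, links2, interlinks

# Toy example: a triangle (net 1) joined to a single edge (net 2) by two interlinks.
g1 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
g2 = [[0, 1], [1, 0]]
h = [[1, 0], [0, 0], [0, 1]]
result = degrees(g1, g2, h)
```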

We introduce a natural generalization of the configuration model (we
recommend that the reader look over the Appendix to recall the configuration
model and the standard analytical approach to the statistics of its
components). In our random network, intralinks in subnetworks 1 and 2 are
uncorrelated, and the set of interlinks connecting them is also
uncorrelated. As in the configuration model, our statistical ensemble
includes all possible networks with given sequences of intra- and
inter-degrees for both subnetworks. All the members of the ensemble are
taken with the same statistical weight. Namely, there are
$\mathcal{N}_{1,2}(N_1,N_2;q,r)$ vertices in subnetworks 1, 2 of
intra-degree $q$ and inter-degree $r$. Here
$\sum_{q,r}\mathcal{N}_{1,2}(N_1,N_2;q,r) = N_{1,2}$. The condition

$$\sum_{q,r} r\,\mathcal{N}_1(N_1,N_2;q,r) = \sum_{q,r} r\,\mathcal{N}_2(N_1,N_2;q,r) = \mathcal{L},$$

where $\mathcal{L}$ is the number of interlinks, should be fulfilled. We
assume that in the thermodynamic limit $N_1 \to \infty$, $N_2 \to \infty$,
$N_2/N_1 \to \kappa < \infty$, we have
$\mathcal{N}_{1,2}(N_1,N_2;q,r)/N_{1,2} \to \Pi_{1,2}(q,r)$, where $\Pi_1$
and $\Pi_2$ are given distribution functions. Again, there is a condition
that the number of edges from subnetwork 1 to 2 is the same as from 2 to 1:

$$\sum_{q,r=0}^{\infty} r\,\Pi_1(q,r) \equiv \bar r_1 = \kappa \sum_{q,r=0}^{\infty} r\,\Pi_2(q,r) \equiv \kappa\,\bar r_2. \tag{7}$$

Here $\bar r_{1,2}$ are average inter-degrees of the vertices in
subnetworks 1 and 2.
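
One member of such an ensemble can be sampled by random stub matching. The sketch below is a hypothetical illustration, not the authors' procedure: it assumes each subnetwork is given as a list of (intra-degree, inter-degree) pairs, matches intra-stubs within each subnetwork, and matches the inter-stubs of net 1 with those of net 2. Plain stub matching may produce self-loops and multiple edges, which this sketch does not remove. Their fraction is expected to be negligible for large sparse networks.

```python
import random

def match_stubs(stubs, rng):
    """Pair up a list of stubs uniformly at random."""
    stubs = stubs[:]
    rng.shuffle(stubs)
    return [(stubs[k], stubs[k + 1]) for k in range(0, len(stubs), 2)]

def modular_configuration_model(seq1, seq2, seed=0):
    """seq1, seq2: lists of (intra_degree, inter_degree), one pair per vertex.
    Vertices of net 1 are labelled 0..N1-1, those of net 2 N1..N1+N2-1.
    Returns a list of edges (pairs of labels)."""
    rng = random.Random(seed)
    n1 = len(seq1)
    intra1 = [i for i, (q, _) in enumerate(seq1) for _ in range(q)]
    intra2 = [n1 + a for a, (q, _) in enumerate(seq2) for _ in range(q)]
    inter1 = [i for i, (_, r) in enumerate(seq1) for _ in range(r)]
    inter2 = [n1 + a for a, (_, r) in enumerate(seq2) for _ in range(r)]
    if len(intra1) % 2 or len(intra2) % 2:
        raise ValueError("each sum of intra-degrees must be even")
    if len(inter1) != len(inter2):
        raise ValueError("the sums of inter-degrees must coincide")
    rng.shuffle(inter2)
    edges = match_stubs(intra1, rng) + match_stubs(intra2, rng)
    edges += list(zip(inter1, inter2))            # interlinks
    return edges

# Toy degree sequences: three vertices in net 1, two in net 2, two interlinks.
seq1 = [(2, 1), (2, 0), (2, 1)]
seq2 = [(1, 1), (1, 1)]
edges = modular_configuration_model(seq1, seq2, seed=1)
```

The check on the inter-degree sums is the finite-size version of condition (7).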

The theory of uncorrelated networks extensively uses the Z-representation
(generating function) of a degree distribution:

$$\phi(z) = \sum_{q=0}^{\infty} \Pi(q)\,z^q. \tag{8}$$

Here we introduce

$$\phi_{1,2}(x,y) = \sum_{q,r=0}^{\infty} \Pi_{1,2}(q,r)\,x^q y^r. \tag{9}$$



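In Z-representation, the average intra-degrees $\bar q_1$ and $\bar q_2$ and
inter-degrees $\bar r_1$ and $\bar r_2$ of subnetworks 1 and 2,
respectively, are

$$\bar q_{1,2} = \left.\frac{\partial\phi_{1,2}(x,y)}{\partial x}\right|_{x=y=1}, \qquad \bar r_{1,2} = \left.\frac{\partial\phi_{1,2}(x,y)}{\partial y}\right|_{x=y=1}. \tag{10}$$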
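
For a distribution with finite support these quantities are easy to evaluate numerically. The sketch below assumes $\Pi(q,r)$ is stored as a dictionary mapping `(q, r)` to a probability. It computes the mean degrees and the quantity $K$ of Eq. (1) by direct summation and evaluates $\phi(x,y)$ of Eq. (9), so that relation (10) can be checked by a finite difference. The names and the toy distribution are illustrative.

```python
def phi(joint, x, y):
    """Generating function (9) of a joint degree distribution."""
    return sum(p * x**q * y**r for (q, r), p in joint.items())

def moments(joint):
    """Mean intra-degree, mean inter-degree and K of Eq. (1)."""
    q_mean = sum(q * p for (q, r), p in joint.items())
    r_mean = sum(r * p for (q, r), p in joint.items())
    k = sum(q * (q - 1) * p for (q, r), p in joint.items()) / q_mean
    return q_mean, r_mean, k

# Toy example: every vertex has intra-degree 3; 10% of vertices carry one interlink.
joint = {(3, 0): 0.9, (3, 1): 0.1}
q_mean, r_mean, k = moments(joint)

# Central finite differences approximate the derivatives in Eq. (10).
step = 1e-4
dphi_dx = (phi(joint, 1 + step, 1) - phi(joint, 1 - step, 1)) / (2 * step)
dphi_dy = (phi(joint, 1, 1 + step) - phi(joint, 1, 1 - step)) / (2 * step)
```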



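Let $(j,i)$, $(\beta,\alpha)$, $(j,\alpha)$ and $(\beta,i)$ be ordered
vertex pairs. Let us name their elements in the first and second position as
final and initial, respectively. An end vertex degree distribution is the
conditional probability for the final vertex of some (randomly chosen)
ordered pair of vertices to have intra- and inter-degrees $q$ and $r$,
respectively, provided the vertices in this pair are connected by an edge.
We have four distributions, each one depending on two variables:

$$P_1(q,r) = \frac{1}{2L_1}\Big\langle \sum_{j,i} g_{ji}\,\delta(q_j - 1 - q)\,\delta(r_j - r) \Big\rangle,$$
$$P_2(q,r) = \frac{1}{2L_2}\Big\langle \sum_{\beta,\alpha} g_{\beta\alpha}\,\delta(q_\beta - 1 - q)\,\delta(r_\beta - r) \Big\rangle,$$
$$Q_1(q,r) = \frac{1}{\mathcal{L}}\Big\langle \sum_{j,\alpha} g_{j\alpha}\,\delta(q_j - q)\,\delta(r_j - 1 - r) \Big\rangle,$$
$$Q_2(q,r) = \frac{1}{\mathcal{L}}\Big\langle \sum_{\beta,i} g_{\beta i}\,\delta(q_\beta - q)\,\delta(r_\beta - 1 - r) \Big\rangle. \tag{11}$$

Taking into account definitions of vertex degrees, we have

$$P_{1,2}(q,r) = (q+1)\,\Pi_{1,2}(q+1,r)/\bar q_{1,2}, \qquad Q_{1,2}(q,r) = (r+1)\,\Pi_{1,2}(q,r+1)/\bar r_{1,2}. \tag{12}$$

In Z-representation these distribution functions take the following forms:

$$\xi_{1,2}(x,y) = \frac{1}{\bar q_{1,2}}\frac{\partial\phi_{1,2}(x,y)}{\partial x}, \qquad \eta_{1,2}(x,y) = \frac{1}{\bar r_{1,2}}\frac{\partial\phi_{1,2}(x,y)}{\partial y}. \tag{13}$$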


Let us introduce the n-th components of ordered vertex pairs, $C_{n,ji}$,
$C_{n,\beta\alpha}$, $C_{n,j\alpha}$ and $C_{n,\beta i}$. These components
are sets whose elements are vertices. As is natural, the components are
empty if the vertices in a pair are not connected. The first component is
either a one-element set consisting of the final vertex, or an empty set.
For example, $C_{1,\beta i}$ is either vertex $\beta$ or $\emptyset$. The
second component, if nonempty, also contains all the nearest neighbours of
the final vertex, except the initial one, and so on. We have four types of
the components of an edge: $C_{n,ji}$, $C_{n,\beta\alpha}$, $C_{n,j\alpha}$
and $C_{n,\beta i}$. They are defined in a recursive way, similarly to the
standard configuration model (see Appendix). Each of these four n-th
components itself consists of two disjoint sets: one of vertices in
subnetwork 1, the other of vertices in subnetwork 2. For example,
$C_{n,ji} = C^{(1)}_{n,ji} \cup C^{(2)}_{n,ji}$. The sizes of the components
are $M^{(1)}_{n,ji} = |C^{(1)}_{n,ji}|$, etc.
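Taking into account the locally tree-like structure of our network gives,
for $\mu = 1, 2$,

$$M^{(\mu)}_{n,ji} = g_{ji}\Big[\delta_{\mu,1} + \sum_{k \neq i} M^{(\mu)}_{n-1,kj} + \sum_{\gamma} M^{(\mu)}_{n-1,\gamma j}\Big],$$
$$M^{(\mu)}_{n,j\alpha} = g_{j\alpha}\Big[\delta_{\mu,1} + \sum_{k} M^{(\mu)}_{n-1,kj} + \sum_{\gamma \neq \alpha} M^{(\mu)}_{n-1,\gamma j}\Big],$$
$$M^{(\mu)}_{n,\beta\alpha} = g_{\beta\alpha}\Big[\delta_{\mu,2} + \sum_{\gamma \neq \alpha} M^{(\mu)}_{n-1,\gamma\beta} + \sum_{k} M^{(\mu)}_{n-1,k\beta}\Big],$$
$$M^{(\mu)}_{n,\beta i} = g_{\beta i}\Big[\delta_{\mu,2} + \sum_{\gamma} M^{(\mu)}_{n-1,\gamma\beta} + \sum_{k \neq i} M^{(\mu)}_{n-1,k\beta}\Big]. \tag{14}$$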



The configuration model is an uncorrelated random network. So all the terms
on the right-hand side of each of the four equations (14) are independent
random variables. Quantities within each of the two sums in these equations
are equally distributed. Their statistical properties are also independent
of the degree distribution of the initial vertex of the edge, i.e., of $j$
or $\beta$.

The sizes of the connected components of an edge in different networks
[e.g., $M^{(1)}_{n,ji}$ and $M^{(2)}_{n,ji}$] are, generally, correlated. So
we introduce four joint distribution functions of the component sizes in
different networks. In Z-representation they are defined as follows:



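$$\psi^{(1)}_n(x,y) = \frac{1}{2L_1}\Big\langle \sum_{j,i} g_{ji}\,x^{M^{(1)}_{n,ji}}\,y^{M^{(2)}_{n,ji}} \Big\rangle,$$
$$\psi^{(2)}_n(x,y) = \frac{1}{2L_2}\Big\langle \sum_{\beta,\alpha} g_{\beta\alpha}\,x^{M^{(1)}_{n,\beta\alpha}}\,y^{M^{(2)}_{n,\beta\alpha}} \Big\rangle,$$
$$\theta^{(1)}_n(x,y) = \frac{1}{\mathcal{L}}\Big\langle \sum_{j,\alpha} g_{j\alpha}\,x^{M^{(1)}_{n,j\alpha}}\,y^{M^{(2)}_{n,j\alpha}} \Big\rangle,$$
$$\theta^{(2)}_n(x,y) = \frac{1}{\mathcal{L}}\Big\langle \sum_{\beta,i} g_{\beta i}\,x^{M^{(1)}_{n,\beta i}}\,y^{M^{(2)}_{n,\beta i}} \Big\rangle. \tag{15}$$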



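The recursive relations for these distributions are a straightforward
generalization of the relation for a usual uncorrelated network, without
modularity [see Eq. (A.7) in the Appendix]:

$$\psi^{(1)}_n(x,y) = x\,\xi_1\big[\psi^{(1)}_{n-1}(x,y),\,\theta^{(2)}_{n-1}(x,y)\big],$$
$$\psi^{(2)}_n(x,y) = y\,\xi_2\big[\psi^{(2)}_{n-1}(x,y),\,\theta^{(1)}_{n-1}(x,y)\big],$$
$$\theta^{(1)}_n(x,y) = x\,\eta_1\big[\psi^{(1)}_{n-1}(x,y),\,\theta^{(2)}_{n-1}(x,y)\big],$$
$$\theta^{(2)}_n(x,y) = y\,\eta_2\big[\psi^{(2)}_{n-1}(x,y),\,\theta^{(1)}_{n-1}(x,y)\big]. \tag{16}$$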
The n-th component $C_{n,i}$ ($C_{n,\alpha}$) of vertex $i$ ($\alpha$)
contains all vertices at distance $n$ from vertex $i$ ($\alpha$) or closer.
Let $\mathcal{M}^{(1,2)}_{n,i}$ and $\mathcal{M}^{(1,2)}_{n,\alpha}$ be the
sizes of the components [$C^{(1)}$ and $C^{(2)}$ are the subsets of $C$
containing vertices of the first and second networks, respectively]. Using
the locally tree-like structure of the network and the absence of
correlations between its vertices, we obtain the Z-transform of the joint
distributions of component sizes:

= x<t>i (x,y),6W , 

= yct> 2 [i/,W(x,y),0W (*, y)} ■ (17) 

The conditional average sizes of the components are expressed in terms of
the derivatives of the corresponding distribution functions at the point
$x = y = 1$. For example, the conditional average component sizes for an
internal vertex pair in network 1 are expressed as follows:

$$\overline{M}^{(111)}_n \equiv \big\langle g_{ij} M^{(1)}_{n,ij} \big\rangle / \langle g_{ij} \rangle = \left.\frac{\partial\psi^{(1)}_n(x,y)}{\partial x}\right|_{x=y=1} \tag{18}$$

for the component part in network 1, and

$$\overline{M}^{(211)}_n \equiv \big\langle g_{ij} M^{(2)}_{n,ij} \big\rangle / \langle g_{ij} \rangle = \left.\frac{\partial\psi^{(1)}_n(x,y)}{\partial y}\right|_{x=y=1} \tag{19}$$

for the part of the component which is in network 2. Here, (i) the first
superscript index indicates whether the component is in subnetwork 1 or 2,
(ii) the second superscript index indicates whether the final vertex is in
subnetwork 1 or 2, and (iii) the third superscript index indicates whether
the initial vertex is in subnetwork 1 or 2. For the components of a pair
with initial vertex in network 2 and final in network 1 we have:

$$\overline{M}^{(112)}_n \equiv \big\langle g_{i\alpha} M^{(1)}_{n,i\alpha} \big\rangle / \langle g_{i\alpha} \rangle = \left.\frac{\partial\theta^{(1)}_n(x,y)}{\partial x}\right|_{x=y=1} \tag{20}$$

and

$$\overline{M}^{(212)}_n \equiv \big\langle g_{i\alpha} M^{(2)}_{n,i\alpha} \big\rangle / \langle g_{i\alpha} \rangle = \left.\frac{\partial\theta^{(1)}_n(x,y)}{\partial y}\right|_{x=y=1}, \tag{21}$$

and so on.

Using Eqs. (16), one can derive recurrent relations for the average values
of $M^{(1)}_n$ and $M^{(2)}_n$. We introduce a pair



of four-dimensional vectors: 
/Mi 1U) \ 



Mi 1 ) = 



M (112) 



Mi 2 ) 



/m1 211) \ 

M (221) 

^t-f(222) 

V M » 7 



(22) 



Then the recurrent relations take the forms: 



Ml 1 ) = CM« 1 
where 



mi , Mi 2 )=CMi 2 2 1 



m 2 



/6i o 6i 

7711 7721 

7722 7/12 

V 62 62 



Here 

£,fj,u — d^ v (x, y) \ x _y =1 



(23) 



(24) 



?» = dyftv (x,y)\ 



x—y—l * 



/I, V 

r(2) 



1,2. The initial conditions are M 



(i) 



mi, 



' = ni2. As for the average sizes of the n-th compo- 



nents of vertices, they are 



54-i 1 ) (x, y) 



dx 



x=y=l 



- 17 (111) 
qiM n 



-"iMi 112) 




(x, y) 



dy 



x=y=l 



= giM 



(211) 



fiM 



(212) 



(12) 




(x,y) 



dx 



x=y=l 



—(122) —(121) 



M 



(22) 



5^£ 2) (or, y) 



1 



dy 



x=y=l 



- If* 222 ' 



- tj( 221 ) 

r 2 M n 



(25) 



Here, (i) the first superscript index of $\overline{\mathcal{M}}_n$
indicates whether the component is in subnetwork 1 or 2, and (ii) the second
superscript index indicates whether a mother vertex is in subnetwork 1 or 2.
Recall that $\bar q_1$ and $\bar q_2$ are the mean numbers of internal
connections of vertices in subnetworks 1 and 2, respectively; $\bar r_1$ is
the mean number of connections of a vertex in subnetwork 1 which go to
subnetwork 2; and finally $\bar r_2$ is the mean number of connections of a
vertex in subnetwork 2 which go to subnetwork 1. Relations (23) and (25)
allow us to obtain the average sizes of all components.

Example. Since the formulas in this section are rather cumbersome, to help
the reader, we present a simple demonstrative example of the application of
these relations. Let us describe the emergence of a giant connected
component in a symmetric situation, where both subnetworks have equal sizes
and identical degree distributions $\Pi_1(q,r) = \Pi_2(q,r) \equiv \Pi(q,r)$.
In this case, $\xi_1(x,y) = \xi_2(x,y) \equiv \xi(x,y)$ and
$\eta_1(x,y) = \eta_2(x,y) \equiv \eta(x,y)$. Also,
$\psi^{(1)}(x,y) = \psi^{(2)}(x,y) \equiv \psi(x,y)$ and
$\theta^{(1)}(x,y) = \theta^{(2)}(x,y) \equiv \theta(x,y)$. So the relative
size $S$ of a giant connected component takes the form:

$$S = 1 - \phi(t,u), \tag{26}$$

where $t \equiv \psi(1,1)$ and $u \equiv \theta(1,1)$ are non-trivial
solutions of the equations:

$$t = \xi(t,u), \qquad u = \eta(t,u). \tag{27}$$

For example, let the subnetworks be classical random graphs, and each vertex
has no interlinks with probability $1-p$ and has a single interlink with the
complementary probability $p$. That is,

$$\Pi(q,r) = e^{-\bar q}\,\frac{\bar q^{\,q}}{q!}\,\big[(1-p)\,\delta_{r,0} + p\,\delta_{r,1}\big], \tag{28}$$

where $\bar q$ is the mean vertex intra-degree, so
$\phi(x,y) = e^{\bar q(x-1)}\,[1 - p + p\,y]$.

For a single classical random graph with vertices of average degree
$\bar q$, Eqs. (A.15) and (A.17) give the point of the birth of a giant
connected component, $\bar q = \bar q_c = 1$, and the relative size of this
component, $S \cong 2(\bar q - 1)$, in the critical region.

Let us now find the birth point $\bar q = \bar q_c(p)$ and the critical
dependence $S(\bar q,p)$ in the modular network. For this network, we find
$\xi(x,y) = \partial_x\phi(x,y)/\bar q = \phi(x,y)$ and
$\eta(x,y) = \partial_y\phi(x,y)/\bar r = e^{\bar q(x-1)}$. Substituting
these functions into Eqs. (26) and (27) directly leads to the result:

$$\bar q_c = \frac{1}{1+p}, \qquad S \cong \frac{2(1+p)^3}{1+3p}\left(\bar q - \frac{1}{1+p}\right);$$

compare with a single classical random graph.
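
The birth point and the size of the giant component in this example can also be obtained numerically. The following is a minimal sketch, assuming the forms $\xi(x,y) = e^{\bar q(x-1)}[1-p+py]$ and $\eta(x,y) = e^{\bar q(x-1)}$ quoted in the example: it solves the pair of equations (27) by fixed-point iteration started from $t = u = 0$, which is expected to converge to the smallest solution. The function name, the starting point and the iteration count are choices made for this illustration.

```python
import math

def giant_component_size(q_mean, p, iterations=10000):
    """Relative size S = 1 - phi(t, u) for the symmetric example:
    Poisson intra-degrees with mean q_mean, one interlink with probability p."""
    t, u = 0.0, 0.0                      # start away from the trivial root t = u = 1
    for _ in range(iterations):
        growth = math.exp(q_mean * (t - 1.0))
        t_new = growth * (1.0 - p + p * u)    # t = xi(t, u)
        u_new = growth                        # u = eta(t, u)
        t, u = t_new, u_new
    phi = math.exp(q_mean * (t - 1.0)) * (1.0 - p + p * u)
    return 1.0 - phi

# Below the birth point 1/(1+p) there is no giant component; above it S > 0.
s_below = giant_component_size(0.4, 0.5)
s_above = giant_component_size(0.7, 0.5)
s_single = giant_component_size(2.0, 0.0)    # p = 0: a single classical random graph
```

Close to the birth point the iteration converges slowly, so the number of iterations may need to be increased there.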



IV. INTERVERTEX DISTANCE 
DISTRIBUTION 

As was explained in Sec. II, the intervertex distance distribution in the
thermodynamic limit is completely determined by the three mean intervertex
distances: $\bar\ell_1$ for subnetwork 1, $\bar\ell_2$ for subnetwork 2, and
$\bar d$ for pairs of vertices where the first vertex is in subnetwork 1 and
the second is in subnetwork 2. The idea of the computation of these
intervertex distances is very similar to that in the standard configuration
model, see the Appendix, Eq. (A.18). However, the straightforward
calculations for two interconnected networks are cumbersome, so here we only
indicate some points in our derivations without going into technical
details.

The calculations are based on the solution of recursive relations (23). As
is usual, these relations should be investigated in the range $1 - x \ll 1$,
$1 - y \ll 1$ of the Z-transformation parameters. Fortunately, the problem
can be essentially reduced to the calculation of the two highest eigenvalues
of a single $4 \times 4$ matrix. The resulting eigenvalues $\zeta_1$ and
$\zeta_2$ for networks with $\mathcal{L}/N_{1,2} \ll 1$ are given by
formulas (2) and (6). The n-th component sizes are expressed in terms of
these eigenvalues. The leading contributions to the n-th component sizes
turn out to be linear combinations of powers of the mean branchings:
$A\zeta_1^n + B\zeta_2^n$. The factors $A$ and $B$ do not depend on $n$. For
example, when $\zeta_1 > \zeta_2$, the main contributions to
$\overline{\mathcal{M}}^{(11)}_n$ and $\overline{\mathcal{M}}^{(21)}_n$ look
as $\zeta_1^n + [\mathcal{L}^2/(N_1N_2)]\,\zeta_2^n$ and
$(\mathcal{L}/N_1)\,\zeta_2^n$, respectively. Here we omitted non-essential
factors and assumed a large $n$. This approximation is based on the tree
ansatz, that is, on the locally tree-like structure of the network. This
ansatz works when the n-th components are much smaller than subnetworks 1
and 2. So the intervertex distances are obtained by comparing the sizes of
relevant n-th components of vertices with $N_1$ and $N_2$. Since networks 1
and 2 are uncorrelated, this estimate gives only a constant additive error,
which is much smaller than the main contribution of the order of
$\ln N_{1,2}$. (See Ref. [10] for complicated calculations beyond the tree
ansatz in the standard configuration model, which allow one to obtain this
constant number.)

One should emphasize an additional difficulty specific to the networks under
consideration. The problem is that in some range of $N_1$ and $N_2$, and
$\zeta_1$ and $\zeta_2$, while an n-th component in, say, network 1 is
already of size $\sim N_1$ (failing the tree ansatz), the corresponding n-th
component in network 2 is still much smaller than $N_2$. In terms of
Sec. III this, e.g., means that there exists a range of $n$,
$\bar\ell_1 < n < \bar d$, where $\overline{\mathcal{M}}^{(11)}_n \sim N_1$
but still $\overline{\mathcal{M}}^{(21)}_n \ll N_2$. Computing
$\overline{\mathcal{M}}^{(21)}_n$ in this regime, we use the tree ansatz,
while $\overline{\mathcal{M}}^{(11)}_n$ is set to $N_1$. This approximation
also produces only a constant additive error, which one may ignore in these
asymptotic estimates.



V. DISCUSSION AND CONCLUSIONS 

A few points should be stressed. 

(i) In Section III we derived relations for the Z-transformation of the
distributions of n-th components. Quite similarly to the standard
configuration model (see Appendix), using these formulas with
$n \to \infty$ readily gives corresponding relations for the statistics of
finite connected components and for the size of a giant connected component.
Note that when subnetworks 1 and 2 are uncorrelated, which is our case,
finite connected components are essential in an infinite network only if
there is a finite fraction of vertices of degree 1.

(ii) The theory of Sec. III is essentially based on the locally tree-like
structure of the networks under consideration. In principle, one can go even
beyond the tree ansatz, as was done for the standard configuration model in
Ref. [10]. This is a challenging problem for these networks. Since we
extensively used the tree approximation in Sec. IV, our results for the
intervertex distances are only asymptotic estimates.

(iii) For the sake of brevity, we obtained relations only for networks with
two interlinked subnetworks, but this is not a restriction. A generalization
to networks with an arbitrary number of interlinked subnetworks is
straightforward. The final relations in Sec. III can be readily generalized
without derivation. Generalization to structured networks with degree-degree
correlations is also clear. Note that, in particular, our theory can
describe multi-partite networks, whose subnetworks have no
intra-connections. Based on equations derived for the configuration model
[22, 23] (k-cores have also been studied in real-world networks), one can
also generalize this theory to describe the k-core organization of modular
networks.

(iv) As an application, we considered interlinked networks with a relatively
small number of shortcuts. Note, however, that our general results in
Sec. III do not assume this restriction.

In summary, we have developed an analytical approach 
to the statistics of networks with well distinguished com- 
munities. We have derived general relations allowing one 
to find the distributions of the sizes of connected compo- 
nents in these networks. As a particular application of 
this theory, we have obtained asymptotic estimates for 
the distribution of intervertex distances in two weakly 
interconnected uncorrelated networks. We have shown 
that in the infinite network limit, vertices in this net- 
work are almost surely equidistant if the relative number 
of interlinks is any finite number. Our approach can be 
applied to a number of other problems for networks of 
this sort, including the birth of a giant connected com- 
ponent, percolation, and others. 
APPENDIX: STATISTICS OF THE 
CONFIGURATION MODEL 

For the sake of clarity, here we outline the Z-transformation (generating
function) technique in application to the standard configuration model of an
uncorrelated graph with a given degree distribution $\Pi(q)$. For more
detail, see, e.g., Ref. [10]. In simple terms, the configuration model
[3, 11] is a maximally random graph with a given degree distribution. In
graph theory it is also called a random graph with a given degree sequence.

FIG. 3: The first (a) and the second (b) components of edge (ij). Filled
vertices belong to the components. The 0-th component is empty.

A graph of size $N$ consists of a set of vertices $v_i$,
$i = 1, 2, \ldots, N$, connected by edges $e_{ij}$. An edge exists if the
adjacency matrix element $g_{ij} = 1$. We start from the following
distribution:

$$P(q) = \frac{1}{2L}\Big\langle \sum_{i,j=1}^{N} g_{ij}\,\delta(q_j - 1 - q) \Big\rangle, \tag{A.1}$$

where $\delta(k)$ is the Kronecker symbol. This is the probability that a
randomly chosen end of a randomly chosen edge in the graph has branching
$q$. Alternatively, it may be considered as the conditional probability for
the final vertex $j$ in a randomly chosen ordered pair $(j,i)$ to have
degree $q+1$, provided the vertices are connected by an edge. Obviously,

$$P(q) = \frac{(q+1)\,\Pi(q+1)}{\bar q}, \tag{A.2}$$

where $\bar q = \langle q \rangle$ is the average degree of a vertex. In the
Z-representation this relation takes the form:

$$\sum_{q=0}^{\infty} P(q)\,z^q = \frac{1}{2L}\Big\langle \sum_{i,j=1}^{N} g_{ij}\,z^{q_j-1} \Big\rangle = \frac{\phi'(z)}{\bar q}. \tag{A.3}$$

Note that $\bar q = \phi'(1)$.

Let the n-th component of the ordered pair $(j,i)$, $C_{n,ji}$, be the
following set of vertices. For any (ordered) pair of vertices $(j,i)$,
$C_{1,ji}$ is vertex $j$ if the vertices are connected, else
$C_{1,ji} = \emptyset$. For $n > 1$, $C_{n,ji}$ is defined recursively as
follows. If $g_{ji} = 0$, all $C_{n,ji} = \emptyset$. Otherwise, in
$C_{2,ji}$ there are also $q_j - 1$ other vertices, connected to the vertex
$j$; the third component $C_{3,ji}$ contains also all other vertices
connected with ones of the second component, and so on, see Fig. 3.

In the thermodynamic limit ($N \to \infty$) almost every finite n-th
component of an uncorrelated random graph is a tree. Then for the sizes
(numbers of vertices) of the components, $M_{n,ji} = |C_{n,ji}|$, we have
(assuming $g_{ji} = 1$):

$$M_{n,ji} = 1 + \sum_{k \neq i} M_{n-1,kj}, \tag{A.4}$$

with the initial condition $M_{1,ji} = 1$. Due to the absence of
correlations in the configuration model, $M_{n,kj}$ and $M_{n,lj}$,
$k \neq l$, are independent equally distributed random variables. We define
the distribution function of the n-th component of an edge as

$$p_n(M) = \frac{1}{2L}\Big\langle \sum_{i,j=1}^{N} g_{ji}\,\delta(M_{n,ji} - M) \Big\rangle = \frac{N(N-1)}{2L}\big\langle g_{ji}\,\delta(M_{n,ji} - M) \big\rangle = \frac{1}{\langle g_{ji} \rangle}\big\langle g_{ji}\,\delta(M_{n,ji} - M) \big\rangle. \tag{A.5}$$



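It is more convenient to use the Z-transformation of this distribution:

$$\psi_n(x) \equiv \sum_{M} p_n(M)\,x^M = \frac{1}{\langle g_{ji} \rangle}\big\langle g_{ji}\,x^{M_{n,ji}} \big\rangle. \tag{A.6}$$

Substituting Eq. (A.4) into Eq. (A.6) and using Eq. (A.3) gives

$$\psi_n(x) = \frac{x}{2L}\Big\langle \sum_{i,j=1}^{N} g_{ji}\,[\psi_{n-1}(x)]^{q_j-1} \Big\rangle = x\,\frac{\phi'[\psi_{n-1}(x)]}{\bar q}. \tag{A.7}$$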



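Let $C_{n,i}$ be the n-th component of vertex $v_i$. This component includes
all vertices at distance $n$ or closer from vertex $v_i$. (The 0-th
component of a vertex is empty.) Due to the absence of loops (tree-like
structure) we have the following relation for the size of the n-th component
of vertex $v_i$, $\mathcal{M}_{n,i} = |C_{n,i}|$:

$$\mathcal{M}_{n,i} = 1 + \sum_{j} M_{n-1,ji}. \tag{A.8}$$

So the n-th component size distribution

$$\mathcal{P}_n(M) = \frac{1}{N}\Big\langle \sum_{i=1}^{N} \delta(\mathcal{M}_{n,i} - M) \Big\rangle \tag{A.9}$$

is expressed in Z-representation as

$$\Psi_n(x) = \frac{1}{N}\Big\langle \sum_{i=1}^{N} x^{\mathcal{M}_{n,i}} \Big\rangle = x\,\phi[\psi_{n-1}(x)]. \tag{A.10}$$

The average sizes of subsequent n-th components are related through the
following equations:

$$\overline{\mathcal{M}}_n = \Psi_n'(1) = 1 + \bar q\,\overline{M}_{n-1}, \tag{A.11}$$
$$\overline{M}_n = 1 + \zeta\,\overline{M}_{n-1}, \tag{A.12}$$

where

$$\zeta = \frac{\phi''(1)}{\bar q} = \frac{\langle q^2 \rangle - \bar q}{\bar q} \tag{A.13}$$

is the mean branching. If $\zeta < 1$, both $\overline{M}_n$ and
$\overline{\mathcal{M}}_n$ have finite limits as $n \to \infty$. That is,
the network has no giant connected component. If $\zeta > 1$, a giant
connected component exists.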
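
The recursions (A.11) and (A.12) are simple enough to iterate directly. The sketch below assumes the degree distribution is stored as a dictionary mapping a degree to its probability, and that the 0-th component of an edge is empty, as stated in the caption of Fig. 3. The function names are illustrative.

```python
def mean_branching(pi):
    """zeta = (<q^2> - <q>) / <q> for a degree distribution pi[q]."""
    q1 = sum(q * p for q, p in pi.items())
    q2 = sum(q * q * p for q, p in pi.items())
    return (q2 - q1) / q1

def mean_component_sizes(pi, n_max):
    """Mean sizes of the n-th components of an edge, Eq. (A.12), and of a
    vertex, Eq. (A.11), for n = 1, ..., n_max (tree approximation)."""
    q_mean = sum(q * p for q, p in pi.items())
    zeta = mean_branching(pi)
    edge, vertex = [], []
    m_prev = 0.0                                # 0-th component of an edge is empty
    for _ in range(n_max):
        vertex.append(1.0 + q_mean * m_prev)    # Eq. (A.11)
        m_prev = 1.0 + zeta * m_prev            # Eq. (A.12)
        edge.append(m_prev)
    return edge, vertex

# Toy example: a random 3-regular graph, for which zeta = 2.
edge_sizes, vertex_sizes = mean_component_sizes({3: 1.0}, 3)
```

For $\zeta > 1$ both sequences grow as $\zeta^n$, and this growth is what the distance estimate at the end of the Appendix relies on.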

Assuming $\psi_n = \psi_{n-1} = \psi$ in Eq. (A.7), we obtain an equation
for the distribution function of the sizes of an edge's connected
components,

$$\psi(x) = x\,\frac{\phi'[\psi(x)]}{\bar q}, \tag{A.14}$$

which implicitly defines $\psi(x)$. If $\zeta > 1$, this equation has two
solutions at $x = 1$. One is $\psi(1) = 1$, the other is some
$\psi(1) = t < 1$,

$$t = \frac{\phi'(t)}{\bar q}. \tag{A.15}$$

For any value of $\zeta$, $\psi_n(1) = 1$. On the other hand, if
$\zeta > 1$, $\lim_{x \to 1-0}\lim_{n \to \infty} \psi_n(x) = t < 1$. This
is the probability that the connected component of a randomly chosen edge is
finite. Then the probability that a randomly chosen vertex belongs to a
finite connected component of the graph is

$$\sum_{q=0}^{\infty} \Pi(q)\,t^q = \phi(t). \tag{A.16}$$

Therefore the number of vertices in the giant connected component in the
thermodynamic limit is

$$\mathcal{M}_\infty = N\,[1 - \phi(t)]. \tag{A.17}$$
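
Equations (A.15)-(A.17) translate into a short numerical recipe. The following sketch assumes a degree distribution with finite support stored as a dictionary, and solves $t = \phi'(t)/\bar q$ by fixed-point iteration started from $t = 0$, which is expected to converge to the smallest root. The function name and the toy distribution are hypothetical.

```python
def giant_fraction(pi, iterations=10000):
    """Relative size 1 - phi(t) of the giant connected component of the
    configuration model with degree distribution pi[q]."""
    q_mean = sum(q * p for q, p in pi.items())
    t = 0.0
    for _ in range(iterations):
        # t = phi'(t) / q_mean, Eq. (A.15)
        t = sum(q * p * t ** (q - 1) for q, p in pi.items() if q > 0) / q_mean
    return 1.0 - sum(p * t ** q for q, p in pi.items())   # Eqs. (A.16), (A.17)

# Toy example: half of the vertices have degree 1, half have degree 3.
# Here phi'(t)/q_mean = 1/4 + 3 t^2 / 4, with non-trivial root t = 1/3.
s_giant = giant_fraction({1: 0.5, 3: 0.5})
```

For this toy distribution $\zeta = 3/2 > 1$, and the giant component contains the fraction $1 - \phi(1/3) = 22/27$ of the vertices.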



One may find an intervertex distance distribution from the mean sizes of the
n-th components of a vertex. The diameter $\bar\ell$ of the giant connected
component, i.e., the distance between two randomly chosen vertices, is
obtained from the relation
$\zeta^{\bar\ell} \sim \mathcal{M}_\infty \sim N$. So, if the second moment
of the degree distribution is finite,

$$\bar\ell \cong \frac{\ln(aN)}{\ln\zeta}, \tag{A.18}$$

where $a$ is some number of the order of 1. For more straightforward
calculations, see Ref. [10].